Assignment 2: Spatial Analysis and Visualization

Healthcare Access and Equity in Pennsylvania

Author

Joshua Rigsby

Published

October 6, 2025

Assignment Overview

Learning Objectives:

- Apply spatial operations to answer policy-relevant research questions
- Integrate census demographic data with spatial analysis
- Create publication-quality visualizations and maps
- Work with spatial data from multiple sources
- Communicate findings effectively for policy audiences


Part 1: Healthcare Access for Vulnerable Populations

Research Question

Which Pennsylvania counties have the highest proportion of vulnerable populations (elderly + low-income) living far from hospitals?

Your analysis should identify counties that should be priorities for healthcare investment and policy intervention.

Required Analysis Steps

Complete the following analysis, documenting each step with code and brief explanations:

Step 1: Data Collection (5 points)

Load the required spatial data:

- Pennsylvania county boundaries
- Pennsylvania hospitals (from lecture data)
- Pennsylvania census tracts

Your Task:

# Load required packages
# Note: dplyr and ggplot2 are attached by tidyverse, so they do not
# need separate library() calls.

library(tidyverse)   # dplyr, ggplot2, and friends
library(tidycensus)  # ACS demographic data
library(knitr)
library(tigris)      # Census geographies
library(sf)          # spatial data handling
library(broom)
library(scales)
library(patchwork)
library(here)

# Suppress scientific notation in printed output
options(scipen = 999)

# Load spatial data

pa_county_boundries = st_read("data/Pennsylvania_County_Boundaries.shp")
Reading layer `Pennsylvania_County_Boundaries' from data source 
  `/Users/JoshuaRigsby 1/Desktop/Upenn/MUSA_5080_Public Policy_Analytics/Portfolio_Files/portfolio-setup-jrigsbyr5/labs/assignment_2/data/Pennsylvania_County_Boundaries.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 67 features and 19 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -8963377 ymin: 4825316 xmax: -8314404 ymax: 5201413
Projected CRS: WGS 84 / Pseudo-Mercator
pa_hospitals = st_read("data/hospitals.geojson")
Reading layer `hospitals' from data source 
  `/Users/JoshuaRigsby 1/Desktop/Upenn/MUSA_5080_Public Policy_Analytics/Portfolio_Files/portfolio-setup-jrigsbyr5/labs/assignment_2/data/hospitals.geojson' 
  using driver `GeoJSON'
Simple feature collection with 223 features and 11 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: -80.49621 ymin: 39.75163 xmax: -74.86704 ymax: 42.13403
Geodetic CRS:  WGS 84
census_tracts = tracts(state = "PA", cb = TRUE)

# Check that all data loaded correctly

glimpse(pa_county_boundries)
Rows: 67
Columns: 20
$ OBJECTID   <int> 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347,…
$ MSLINK     <int> 46, 8, 9, 58, 59, 60, 62, 63, 42, 43, 44, 47, 48, 49, 50, 5…
$ COUNTY_NAM <chr> "MONTGOMERY", "BRADFORD", "BUCKS", "TIOGA", "UNION", "VENAN…
$ COUNTY_NUM <chr> "46", "08", "09", "58", "59", "60", "62", "63", "42", "43",…
$ FIPS_COUNT <chr> "091", "015", "017", "117", "119", "121", "125", "127", "08…
$ COUNTY_ARE <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ COUNTY_PER <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ NUMERIC_LA <int> 5, 2, 5, 2, 2, 3, 1, 2, 1, 3, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1,…
$ COUNTY_N_1 <int> 46, 8, 9, 58, 59, 60, 62, 63, 42, 43, 44, 47, 48, 49, 50, 5…
$ AREA_SQ_MI <dbl> 487.4271, 1161.3379, 622.0836, 1137.2480, 319.1893, 683.367…
$ SOUND      <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ SPREAD_SHE <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ IMAGE_NAME <chr> "poll.bmp", "poll.bmp", "poll.bmp", "poll.bmp", "poll.bmp",…
$ NOTE_FILE  <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ VIDEO      <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ DISTRICT_N <chr> "06", "03", "06", "03", "03", "01", "12", "04", "02", "01",…
$ PA_CTY_COD <chr> "46", "08", "09", "59", "60", "61", "63", "64", "42", "43",…
$ MAINT_CTY_ <chr> "4", "9", "1", "7", "8", "5", "4", "6", "5", "4", "7", "3",…
$ DISTRICT_O <chr> "6-4", "3-9", "6-1", "3-7", "3-8", "1-5", "12-4", "4-6", "2…
$ geometry   <MULTIPOLYGON [m]> MULTIPOLYGON (((-8398884 48..., MULTIPOLYGON (…
glimpse(pa_hospitals)
Rows: 223
Columns: 12
$ CHIEF_EXEC <chr> "Peter J Adamo", "Autumn DeShields", "Shawn Parekh", "DIANE…
$ CHIEF_EX_1 <chr> "President", "Chief Executive Officer", "Chief Executive Of…
$ FACILITY_U <chr> "https://www.phhealthcare.org", "https://www.malvernbh.com"…
$ LONGITUDE  <dbl> -79.91131, -75.17005, -75.20963, -80.27907, -79.02513, -75.…
$ COUNTY     <chr> "Washington", "Philadelphia", "Philadelphia", "Washington",…
$ FACILITY_N <chr> "Penn Highlands Mon Valley", "MALVERN BEHAVIORAL HEALTH", "…
$ STREET     <chr> "1163 Country Club Road", "1930 South Broad Street Unit 4",…
$ CITY_OR_BO <chr> "Monongahela", "Philadelphia", "Philadelphia", "WASHINGTON"…
$ LATITUDE   <dbl> 40.18193, 39.92619, 40.02869, 40.15655, 39.80913, 40.24273,…
$ TELEPHONE_ <chr> "724-258-1000", "610-480-8919", "215-483-9900", "7248840710…
$ ZIP_CODE   <chr> "15063", "19145", "19128", "15301", "15552", "19464", "1776…
$ geometry   <POINT [°]> POINT (-79.91131 40.18193), POINT (-75.17005 39.9262)…
glimpse(census_tracts)
Rows: 3,445
Columns: 14
$ STATEFP    <chr> "42", "42", "42", "42", "42", "42", "42", "42", "42", "42",…
$ COUNTYFP   <chr> "001", "013", "013", "013", "013", "011", "011", "011", "01…
$ TRACTCE    <chr> "031101", "100400", "100500", "100800", "101900", "011200",…
$ GEOIDFQ    <chr> "1400000US42001031101", "1400000US42013100400", "1400000US4…
$ GEOID      <chr> "42001031101", "42013100400", "42013100500", "42013100800",…
$ NAME       <chr> "311.01", "1004", "1005", "1008", "1019", "112", "2", "115"…
$ NAMELSAD   <chr> "Census Tract 311.01", "Census Tract 1004", "Census Tract 1…
$ STUSPS     <chr> "PA", "PA", "PA", "PA", "PA", "PA", "PA", "PA", "PA", "PA",…
$ NAMELSADCO <chr> "Adams County", "Blair County", "Blair County", "Blair Coun…
$ STATE_NAME <chr> "Pennsylvania", "Pennsylvania", "Pennsylvania", "Pennsylvan…
$ LSAD       <chr> "CT", "CT", "CT", "CT", "CT", "CT", "CT", "CT", "CT", "CT",…
$ ALAND      <dbl> 3043185, 993724, 1130204, 996553, 573726, 1539365, 1949529,…
$ AWATER     <dbl> 0, 0, 0, 0, 0, 9308, 159015, 12469, 0, 0, 0, 1271, 6352, 74…
$ geometry   <MULTIPOLYGON [°]> MULTIPOLYGON (((-77.03108 3..., MULTIPOLYGON (…
# Checking CRS

st_crs(pa_county_boundries)
Coordinate Reference System:
  User input: WGS 84 / Pseudo-Mercator 
  wkt:
PROJCRS["WGS 84 / Pseudo-Mercator",
    BASEGEOGCRS["WGS 84",
        ENSEMBLE["World Geodetic System 1984 ensemble",
            MEMBER["World Geodetic System 1984 (Transit)"],
            MEMBER["World Geodetic System 1984 (G730)"],
            MEMBER["World Geodetic System 1984 (G873)"],
            MEMBER["World Geodetic System 1984 (G1150)"],
            MEMBER["World Geodetic System 1984 (G1674)"],
            MEMBER["World Geodetic System 1984 (G1762)"],
            MEMBER["World Geodetic System 1984 (G2139)"],
            MEMBER["World Geodetic System 1984 (G2296)"],
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]],
            ENSEMBLEACCURACY[2.0]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4326]],
    CONVERSION["Popular Visualisation Pseudo-Mercator",
        METHOD["Popular Visualisation Pseudo Mercator",
            ID["EPSG",1024]],
        PARAMETER["Latitude of natural origin",0,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",0,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["False easting",0,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",0,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["easting (X)",east,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["northing (Y)",north,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Web mapping and visualisation."],
        AREA["World between 85.06°S and 85.06°N."],
        BBOX[-85.06,-180,85.06,180]],
    ID["EPSG",3857]]
st_crs(pa_hospitals)
Coordinate Reference System:
  User input: WGS 84 
  wkt:
GEOGCRS["WGS 84",
    DATUM["World Geodetic System 1984",
        ELLIPSOID["WGS 84",6378137,298.257223563,
            LENGTHUNIT["metre",1]]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["geodetic latitude (Lat)",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["geodetic longitude (Lon)",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    ID["EPSG",4326]]
st_crs(census_tracts)
Coordinate Reference System:
  User input: NAD83 
  wkt:
GEOGCRS["NAD83",
    DATUM["North American Datum 1983",
        ELLIPSOID["GRS 1980",6378137,298.257222101,
            LENGTHUNIT["metre",1]]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["latitude",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["longitude",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    ID["EPSG",4269]]

Questions to answer:

- How many hospitals are in your dataset? 223 hospitals
- How many census tracts? 3,445 census tracts
- What coordinate reference system is each dataset in?
  - pa_county_boundries: EPSG:3857 (WGS 84 / Pseudo-Mercator)
  - pa_hospitals: EPSG:4326 (WGS 84)
  - census_tracts: EPSG:4269 (NAD83)

Step 2: Get Demographic Data

Use tidycensus to download tract-level demographic data for Pennsylvania.

Required variables:

- Total population
- Median household income
- Population 65 years and over (you may need to sum multiple age categories)

Your Task:

# Get demographic data from ACS

pa_acs_data = get_acs(
  geography = "tract",
  state = "PA",
  variables = c(
    total_pop = "B01003_001",
    median_income = "B19013_001",
    male_65_66 = "B01001_020",
    male_67_69 = "B01001_021",
    male_70_74 = "B01001_022",
    male_75_79 = "B01001_023",
    male_80_84 = "B01001_024",
    male_over_85 = "B01001_025",
    female_65_66 = "B01001_044",
    female_67_69 = "B01001_045",
    female_70_74 = "B01001_046",
    female_75_79 = "B01001_047",
    female_80_84 = "B01001_048",
    female_over_85 = "B01001_049"
    ),
  year = 2022,
  survey = "acs5",
  output = "wide"
)

# Column for total 65+ population

pa_acs_data = pa_acs_data %>%
  mutate(
    over_65 = male_65_66E + male_67_69E + male_70_74E + male_75_79E +
      male_80_84E + male_over_85E +
      female_65_66E + female_67_69E + female_70_74E + female_75_79E +
      female_80_84E + female_over_85E
  )

# Answering the question asking about missing income data and median income across all tracts

missing = pa_acs_data %>%
  summarise(count_missing = sum(is.na(median_incomeE)))

missing
# A tibble: 1 × 1
  count_missing
          <int>
1            63
mean_median_income = pa_acs_data %>%
  summarise(mean_income = mean(median_incomeE, na.rm = TRUE))

mean_median_income
# A tibble: 1 × 1
  mean_income
        <dbl>
1      77527.
# Join to tract boundaries

acs_tracts = census_tracts %>%
  left_join(pa_acs_data, by = "GEOID")

Questions to answer:

- What year of ACS data are you using? 2022 (5-year ACS)
- How many tracts have missing income data? 63
- What is the median income across all PA census tracts? Interpreting this as the mean of the tract-level median household incomes, the answer is $77,527.23.
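The distinction between the mean and the median of tract-level incomes matters here; a minimal base-R sketch with made-up income values (not the real ACS numbers) shows how a single high-income tract pulls the mean well above the median:

```r
# Hypothetical tract-level median household incomes (illustrative only)
incomes <- c(32000, 45000, 51000, 58000, 62000, 70000, 185000, NA)

# Mean and median across tracts, dropping missing values as in the analysis
mean_income   <- mean(incomes, na.rm = TRUE)    # ~71857: pulled up by the outlier
median_income <- median(incomes, na.rm = TRUE)  # 58000: the middle tract
```

Reporting both (or choosing the median) can be more robust when a few wealthy tracts skew the distribution.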


Step 3: Define Vulnerable Populations

Identify census tracts with vulnerable populations based on TWO criteria:

1. Low median household income (choose an appropriate threshold)
2. Significant elderly population (choose an appropriate threshold)

Your Task:

# Filter for vulnerable tracts based on your criteria
# First make income categories

acs_tracts = acs_tracts %>%
  mutate(
    income_category = case_when(
      median_incomeE < 50000 ~ "Low",
      median_incomeE >= 50000 & median_incomeE <= 60000 ~ "Moderate",
      median_incomeE > 60000 ~ "High",
      TRUE ~ NA_character_  
    )
  )

# Create percentages for populations over 65

acs_tracts = acs_tracts %>%
  mutate(pct_over_65 = (over_65 / total_popE) * 100
  )

# I want to get the mean and median and range for pct_over_65 to decide what threshold to choose.

acs_tracts %>%
  st_drop_geometry() %>%  # added because of an error
  summarise(
    mean_pct_over_65 = mean(pct_over_65, na.rm = TRUE),
    median_pct_over_65 = median(pct_over_65, na.rm = TRUE),
    min_pct_over_65 = min(pct_over_65, na.rm = TRUE),
    max_pct_over_65 = max(pct_over_65, na.rm = TRUE)
  )
  mean_pct_over_65 median_pct_over_65 min_pct_over_65 max_pct_over_65
1         19.03108           18.95003               0             100
#Filter for both categories

vulnerable_tracts = acs_tracts %>%
  filter(
    median_incomeE < 50000,   
    pct_over_65 >= 25         
  )

Questions to answer:

- What income threshold did you choose and why? I chose a median household income below $50,000 to designate low-income tracts. Because this is a tract-level median, half of the households in a qualifying tract earn even less than the threshold.
- What elderly population threshold did you choose and why? The mean and median share of residents 65 and over are both close to 19% (range 0–100%), so I chose 25% or more as a significantly elevated elderly population.
- How many tracts meet your vulnerability criteria? 78
- What percentage of PA census tracts are considered vulnerable by your definition? 2.26% (78 of 3,445)
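The vulnerability percentage is straightforward to verify from the tract counts reported above; a small base-R check reproduces it:

```r
n_vulnerable <- 78    # tracts meeting both criteria
n_tracts     <- 3445  # total PA census tracts

# Share of all PA tracts flagged as vulnerable
pct_vulnerable <- n_vulnerable / n_tracts * 100
round(pct_vulnerable, 2)  # 2.26
```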

Step 4: Calculate Distance to Hospitals

For each vulnerable tract, calculate the distance to the nearest hospital.

Your Task:

# Transform to appropriate projected CRS

acs_tracts_2272 = st_transform(acs_tracts, crs = 2272)
pa_county_boundries_2272 = st_transform(pa_county_boundries, crs = 2272)
pa_hospitals_2272 = st_transform(pa_hospitals, crs = 2272)
vulnerable_tracts_2272 = st_transform(vulnerable_tracts, crs = 2272)

# Calculate distance from each tract centroid to nearest hospital

vulnerable_tract_centers = st_centroid(vulnerable_tracts_2272)
hospital_distances = st_distance(vulnerable_tract_centers, pa_hospitals_2272)
nearest_hospital = apply(hospital_distances, 1, min)

nearest_hospital_miles = nearest_hospital / 5280

print(nearest_hospital_miles)
 [1]  0.41510589  9.50503483  3.37503168  0.55181370  3.40121100  3.68435140
 [7]  2.12554246 15.90779688  1.65665104  0.05910259  3.30890759  2.20586101
[13]  1.29067519  1.13403552  1.61945874  1.92938653  0.51076746  3.68049831
[19]  0.92779086  3.21715121  0.28645992  1.92554453  2.16332285  1.90760586
[25]  3.91828507  3.34083463  5.60172043  1.80392088  1.53585610  9.32759085
[31]  2.77265601  2.93731692  5.91646819  3.79675692  0.86689776  3.90044346
[37]  4.11604625  0.33619256 12.19015791  0.25621353  1.02327229  1.55691297
[43]  1.22358078  4.37947780  4.35986661  5.54086971 10.97184544  3.86405813
[49]  2.52786005 11.59571747  0.50215772  0.13280767  0.95815426  0.21091054
[55]  1.01204374  2.46632814  3.40631213 12.37328141  0.10158156  3.18689116
[61]  2.07097077  3.76597486  0.44852300  4.93024368 14.52222584  1.06116615
[67]  0.07692871  1.92903045 17.29263641  7.02710359  1.20114776  1.96923780
[73]  0.15025817  1.02165132  4.11745557 14.14109052 18.13079998  7.71468810
# For question about average

mean_distance = mean(nearest_hospital_miles)
print(mean_distance)
[1] 3.876558
# Merge Together
vulnerable_tracts_2272_all = vulnerable_tracts_2272 %>%
  mutate(
    centroid = st_geometry(vulnerable_tract_centers),
    nearest_hospital_miles = nearest_hospital_miles
  )

Requirements: - Use an appropriate projected coordinate system for Pennsylvania - Calculate distances in miles - Explain why you chose your projection - I chose EPSG:2272 (NAD83 / Pennsylvania South), a State Plane system measured in US survey feet and designed for southern Pennsylvania including Philadelphia, so planar distances are accurate and easily converted to miles.

Questions to answer: - What is the average distance to the nearest hospital for vulnerable tracts? 3.88 miles - What is the maximum distance? 18.13 miles - How many vulnerable tracts are more than 15 miles from the nearest hospital? 3
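As a cross-check on the full distance matrix, `st_nearest_feature()` returns the index of the closest hospital for each centroid, so the same nearest-hospital distances can be computed pairwise. A sketch, assuming the objects created in this step:

```r
# Index of the nearest hospital for each vulnerable-tract centroid
nearest_idx = st_nearest_feature(vulnerable_tract_centers, pa_hospitals_2272)

# Pairwise distances (US survey feet under EPSG:2272), converted to miles
nearest_ft = st_distance(vulnerable_tract_centers,
                         pa_hospitals_2272[nearest_idx, ],
                         by_element = TRUE)
summary(as.numeric(nearest_ft) / 5280)
```

The summary should match the mean and maximum reported from the `apply()` approach above.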


Step 5: Identify Underserved Areas

Define “underserved” as vulnerable tracts that are more than 15 miles from the nearest hospital.

Your Task:

# Create underserved variable

underserved = vulnerable_tracts_2272_all %>%
  filter(nearest_hospital_miles > 15)

# Add the underserved flag to the full data set so it can be aggregated later

vulnerable_tracts_2272_all = vulnerable_tracts_2272_all %>%
  mutate(underserved = nearest_hospital_miles > 15)

Questions to answer: - How many tracts are underserved? 3 - What percentage of vulnerable tracts are underserved? 3.8% - Does this surprise you? Why or why not? Yes, this is surprising. It may be due to the parameters I used to define vulnerability, but I would expect that in rural Pennsylvania tracts especially, a higher percentage of vulnerable people would live more than 15 miles from the nearest hospital.
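The counts and percentage reported above can be tabulated directly from the flag rather than computed by hand. A short sketch using the objects created in this step:

```r
# Tabulate the underserved flag and its share of all vulnerable tracts
vulnerable_tracts_2272_all %>%
  st_drop_geometry() %>%
  count(underserved) %>%
  mutate(pct = round(100 * n / sum(n), 1))
```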

Step 6: Aggregate to County Level

Use spatial joins and aggregation to calculate county-level statistics about vulnerable populations and hospital access.

Your Task:

# Spatial join tracts to counties

tracts_counties = st_join(vulnerable_tracts_2272_all, pa_county_boundries_2272)

# Aggregate statistics by county

statistics_by_county <- tracts_counties %>%
  st_drop_geometry() %>%                                # drop geometry
  group_by(COUNTY_NAM) %>%                              # group by county name
  summarise(
    vulnerable_tracts = n(),  # number of vulnerable tracts
    underserved_tracts = sum(underserved, na.rm = TRUE),  # number of underserved tracts
    pct_underserved = 100 * underserved_tracts / vulnerable_tracts,  # percentage of vulnerable tracts that are underserved
    avg_distance_to_hospital_vulnerable = mean(nearest_hospital_miles, na.rm = TRUE),  # average distance for vulnerable tracts
    total_vulnerable_pop = sum(over_65, na.rm = TRUE) # total vulnerable population
  )

statistics_by_county
# A tibble: 42 × 6
   COUNTY_NAM vulnerable_tracts underserved_tracts pct_underserved
   <chr>                  <int>              <int>           <dbl>
 1 ALLEGHENY                 16                  0               0
 2 BEAVER                     3                  0               0
 3 BEDFORD                    1                  0               0
 4 BLAIR                      2                  0               0
 5 CAMBRIA                    5                  0               0
 6 CAMERON                    1                  0               0
 7 CARBON                     1                  0               0
 8 CENTRE                     1                  0               0
 9 CLARION                    1                  1             100
10 CLEARFIELD                 2                  1              50
# ℹ 32 more rows
# ℹ 2 more variables: avg_distance_to_hospital_vulnerable <dbl>,
#   total_vulnerable_pop <dbl>

Required county-level statistics: - Number of vulnerable tracts - Number of underserved tracts - Percentage of vulnerable tracts that are underserved - Average distance to nearest hospital for vulnerable tracts - Total vulnerable population

Questions to answer: - Which 5 counties have the highest percentage of underserved vulnerable tracts? CLARION, CLEARFIELD, JUNIATA, PERRY, FRANKLIN - Which counties have the most vulnerable people living far from hospitals? NORTHUMBERLAND, ALLEGHENY, CAMBRIA, LEHIGH - Are there any patterns in where underserved counties are located? These are all rural counties located in central and south central PA
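Rankings like the ones above can be pulled programmatically rather than read off the printed tibble. A sketch using the `statistics_by_county` table from this step:

```r
# Five counties with the highest share of underserved vulnerable tracts,
# breaking ties by average distance to the nearest hospital
statistics_by_county %>%
  arrange(desc(pct_underserved), desc(avg_distance_to_hospital_vulnerable)) %>%
  slice_head(n = 5)
```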

Step 7: Create Summary Table

Create a professional table showing the top 10 priority counties for healthcare investment.

Your Task:

# Create and format priority counties table

priority_counties <- c(
  "ALLEGHENY", "CAMBRIA", "FRANKLIN", "MERCER", "CLARION",
  "PERRY", "LEHIGH", "NORTHUMBERLAND", "CLEARFIELD", "JUNIATA"
)

# Filter for these counties and format numbers
top_10_counties = statistics_by_county %>%
  filter(COUNTY_NAM %in% priority_counties) %>%
  mutate(
    avg_distance_to_hospital_vulnerable = round(avg_distance_to_hospital_vulnerable, 2),
    total_vulnerable_pop = scales::comma(total_vulnerable_pop)
  ) %>%
  arrange(desc(vulnerable_tracts))  # sort by number of vulnerable tracts

# Create table
top_10_counties %>%
  select(
    COUNTY_NAM,
    vulnerable_tracts,
    avg_distance_to_hospital_vulnerable,
    total_vulnerable_pop
  ) %>%
  kable(
    caption = "Top 10 Priority Counties for Healthcare Investment",
    col.names = c(
      "County",
      "Vulnerable Tracts",
      "Avg Distance to Hospital (mi)",
      "Total Vulnerable Population"
    ),
    format = "html"
  )
Top 10 Priority Counties for Healthcare Investment

County           Vulnerable Tracts   Avg Distance to Hospital (mi)   Total Vulnerable Population
ALLEGHENY                       16                            2.81                         9,854
CAMBRIA                          5                            5.13                         3,324
MERCER                           3                            2.18                         3,147
CLEARFIELD                       2                           11.11                         1,830
FRANKLIN                         2                            8.57                         1,624
NORTHUMBERLAND                   2                           10.55                         1,149
CLARION                          1                           18.13                           802
JUNIATA                          1                           15.91                           492
LEHIGH                           1                            3.41                         1,412
PERRY                            1                           15.91                           492

Requirements: - Use knitr::kable() or similar for formatting - Include descriptive column names - Format numbers appropriately (commas for population, percentages, etc.) - Add an informative caption - Sort by priority (you decide the metric)
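Since the requirements leave the priority metric open, one option is an explicit composite score. A sketch with equal z-score weights on distance and vulnerable population (the weighting is an illustrative assumption, not part of the assignment):

```r
# Rank counties by a composite of standardized distance and population
statistics_by_county %>%
  mutate(priority_score = as.numeric(scale(avg_distance_to_hospital_vulnerable)) +
                          as.numeric(scale(total_vulnerable_pop))) %>%
  arrange(desc(priority_score)) %>%
  slice_head(n = 10)
```

Making the metric explicit in code documents the ranking choice instead of leaving it implicit in the sort order.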

Part 2: Comprehensive Visualization

Using the skills from Week 3 (Data Visualization), create publication-quality maps and charts.

Map 1: County-Level Choropleth

Create a choropleth map showing healthcare access challenges at the county level.

Your Task:

# Create county-level access map

# First, attach geometry to the county statistics table. Joining the
# statistics onto the county polygons keeps one geometry per county
# (joining tract geometries instead would duplicate each county once per tract).
statistics_by_county_sf = pa_county_boundries_2272 %>%
  select(COUNTY_NAM) %>%
  left_join(statistics_by_county, by = "COUNTY_NAM")

ggplot() +
  geom_sf(data = pa_county_boundries_2272, fill = NA, color = "black") +
  geom_sf(data = statistics_by_county_sf, aes(fill = pct_underserved), color = "white") +
  geom_sf(data = pa_hospitals_2272, shape = 21, fill = "red", size = .25) +
  scale_fill_continuous(labels = scales::percent_format(scale = 1)) + # pct_underserved is already on a 0-100 scale
  labs(
    title = "Healthcare Access Challenges At Pennsylvania County Level",
    subtitle = "Fill = Percentage of vulnerable tracts that are underserved"
  ) +
  theme_void()

Requirements: - Fill counties by percentage of vulnerable tracts that are underserved - Include hospital locations as points - Use an appropriate color scheme - Include clear title, subtitle, and caption - Use theme_void() or similar clean theme - Add a legend with formatted labels


Map 2: Detailed Vulnerability Map

Create a map highlighting underserved vulnerable tracts.

Your Task:

# Create detailed tract-level map

ggplot() +
  # All vulnerable tracts in gray
  geom_sf(data = vulnerable_tracts_2272_all, fill = "lightgray", color = NA) +
  # Underserved tracts highlighted in purple
  geom_sf(
    data = subset(vulnerable_tracts_2272_all, underserved == TRUE),
    fill = "purple",
    color = NA
  ) +
  geom_sf(data = pa_county_boundries_2272, fill = NA, color = "black", size = 0.5) +
  geom_sf(data = pa_hospitals_2272, shape = 21, fill = "black", size = 1) +
  labs(
    title = "Underserved Vulnerable Tracts in Pennsylvania",
    subtitle = "Highlighted tracts = underserved tracts; counties and hospitals shown for context"
  ) +
  theme_void()

Requirements: - Show underserved vulnerable tracts in a contrasting color - Include county boundaries for context - Show hospital locations - Use appropriate visual hierarchy (what should stand out?) - Include informative title and subtitle


Chart: Distribution Analysis

Create a visualization showing the distribution of distances to hospitals for vulnerable populations.

Your Task:

# Create distribution visualization

ggplot(vulnerable_tracts_2272_all, aes(x = total_popE, y = nearest_hospital_miles)) +
  geom_point(alpha = 0.6, color = "darkgreen", size = 2) +
  labs(
    title = "Distance to Hospital vs. Tract Population Size",
    x = "Total population of vulnerable tract",
    y = "Distance to nearest hospital (miles)"
  ) +
  theme_classic()

Suggested chart types: - Histogram or density plot of distances - Box plot comparing distances across regions - Bar chart of underserved tracts by county - Scatter plot of distance vs. vulnerable population size

Requirements: - Clear axes labels with units - Appropriate title - Professional formatting - Brief interpretation (1-2 sentences as a caption or in text)
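For reference, the first suggested chart type (a histogram of distances) takes only a few lines with the same data. A sketch, assuming `vulnerable_tracts_2272_all` as built in Step 4:

```r
# Histogram of nearest-hospital distances across vulnerable tracts
ggplot(vulnerable_tracts_2272_all, aes(x = nearest_hospital_miles)) +
  geom_histogram(binwidth = 1, fill = "steelblue", color = "white") +
  labs(
    title = "Distribution of Distance to Nearest Hospital",
    x = "Distance to nearest hospital (miles)",
    y = "Number of vulnerable tracts"
  ) +
  theme_classic()
```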


Part 3: Bring Your Own Data Analysis

Choose your own additional spatial dataset and conduct a supplementary analysis.

Challenge Options

Choose ONE of the following challenge exercises, or propose your own research question using OpenDataPhilly data (https://opendataphilly.org/datasets/).

Note that these are loose suggestions to spark ideas; follow them or craft your own as the data permits and as your ideas evolve. This analysis should bring in your own dataset and ensure the projection/CRS of your layers align and are appropriate for the analysis (not lat/long or geodetic coordinate systems). The analysis portion should combine spatial and attribute operations to answer a relatively straightforward question.
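The CRS-alignment step described above can be done once, up front, for every layer. A sketch using EPSG:2272 as an example projected CRS for the Philadelphia area (the layer names match those loaded later in this document):

```r
# Reproject all layers to a shared projected CRS before any buffering or
# distance work (EPSG:2272, NAD83 / Pennsylvania South, US survey feet)
target_crs = 2272
neighborhoods_proj = st_transform(neighborhoods, target_crs)
public_art_proj    = st_transform(public_art, target_crs)

# Confirm the layers now share a CRS
stopifnot(st_crs(neighborhoods_proj) == st_crs(public_art_proj))
```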


Education & Youth Services

Option A: Educational Desert Analysis - Data: Schools, Libraries, Recreation Centers, Census tracts (child population) - Question: “Which neighborhoods lack adequate educational infrastructure for children?” - Operations: Buffer schools/libraries (0.5 mile walking distance), identify coverage gaps, overlay with child population density - Policy relevance: School district planning, library placement, after-school program siting

Option B: School Safety Zones - Data: Schools, Crime Incidents, Bike Network - Question: “Are school zones safe for walking/biking, or are they crime hotspots?” - Operations: Buffer schools (1000ft safety zone), spatial join with crime incidents, assess bike infrastructure coverage - Policy relevance: Safe Routes to School programs, crossing guard placement


Environmental Justice

Option C: Green Space Equity - Data: Parks, Street Trees, Census tracts (race/income demographics) - Question: “Do low-income and minority neighborhoods have equitable access to green space?” - Operations: Buffer parks (10-minute walk = 0.5 mile), calculate tree canopy or park acreage per capita, compare by demographics - Policy relevance: Climate resilience, environmental justice, urban forestry investment

Public Safety & Justice

Option D: Crime & Community Resources - Data: Crime Incidents, Recreation Centers, Libraries, Street Lights - Question: “Are high-crime areas underserved by community resources?” - Operations: Aggregate crime counts to census tracts or neighborhoods, count community resources per area, spatial correlation analysis - Policy relevance: Community investment, violence prevention strategies

Infrastructure & Services

Option E: Polling Place Accessibility - Data: Polling Places, SEPTA stops, Census tracts (elderly population, disability rates) - Question: “Are polling places accessible for elderly and disabled voters?” - Operations: Buffer polling places and transit stops, identify vulnerable populations, find areas lacking access - Policy relevance: Voting rights, election infrastructure, ADA compliance


Health & Wellness

Option F: Recreation & Population Health - Data: Recreation Centers, Playgrounds, Parks, Census tracts (demographics) - Question: “Is lack of recreation access associated with vulnerable populations?” - Operations: Calculate recreation facilities per capita by neighborhood, buffer facilities for walking access, overlay with demographic indicators - Policy relevance: Public health investment, recreation programming, obesity prevention


Emergency Services

Option G: EMS Response Coverage - Data: Fire Stations, EMS stations, Population density, High-rise buildings - Question: “Are population-dense areas adequately covered by emergency services?” - Operations: Create service area buffers (5-minute drive = ~2 miles), assess population coverage, identify gaps in high-density areas - Policy relevance: Emergency preparedness, station siting decisions


Arts & Culture

Option H: Cultural Asset Distribution - Data: Public Art, Museums, Historic sites/markers, Neighborhoods - Question: “Do all neighborhoods have equitable access to cultural amenities?” - Operations: Count cultural assets per neighborhood, normalize by population, compare distribution across demographic groups - Policy relevance: Cultural equity, tourism, quality of life, neighborhood identity


Data Sources

OpenDataPhilly: https://opendataphilly.org/datasets/ - Most datasets available as GeoJSON, Shapefile, or CSV with coordinates - Always check the Metadata for a data dictionary of the fields.

Additional Sources: - Pennsylvania Open Data: https://data.pa.gov/ - Census Bureau (via tidycensus): Demographics, economic indicators, commute patterns - TIGER/Line (via tigris): Geographic boundaries

Your Analysis

Your Task:

  1. Find and load additional data
    • Document your data source
    • Check and standardize the CRS
    • Provide basic summary statistics
# Load your additional dataset

neighborhoods = st_read("data/philadelphia-neighborhoods.geojson")
Reading layer `philadelphia-neighborhoods' from data source 
  `/Users/JoshuaRigsby 1/Desktop/Upenn/MUSA_5080_Public Policy_Analytics/Portfolio_Files/portfolio-setup-jrigsbyr5/labs/assignment_2/data/philadelphia-neighborhoods.geojson' 
  using driver `GeoJSON'
Simple feature collection with 159 features and 5 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -75.28026 ymin: 39.86701 xmax: -74.95576 ymax: 40.13799
Geodetic CRS:  WGS 84
public_art = st_read("data/percent_for_art_public.geojson")
Reading layer `percent_for_art_public' from data source 
  `/Users/JoshuaRigsby 1/Desktop/Upenn/MUSA_5080_Public Policy_Analytics/Portfolio_Files/portfolio-setup-jrigsbyr5/labs/assignment_2/data/percent_for_art_public.geojson' 
  using driver `GeoJSON'
Simple feature collection with 239 features and 13 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -75.25883 ymin: 39.8764 xmax: -75.09908 ymax: 40.08824
Geodetic CRS:  WGS 84
historic_sites = st_read("data/historic_sites_philreg.geojson")
Reading layer `historic_sites_philreg' from data source 
  `/Users/JoshuaRigsby 1/Desktop/Upenn/MUSA_5080_Public Policy_Analytics/Portfolio_Files/portfolio-setup-jrigsbyr5/labs/assignment_2/data/historic_sites_philreg.geojson' 
  using driver `GeoJSON'
Simple feature collection with 23375 features and 7 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -75.25813 ymin: 39.87172 xmax: -74.96933 ymax: 40.12603
Geodetic CRS:  WGS 84
landmarks = st_read("data/Landmark_Points.geojson")
Reading layer `Landmark_Points' from data source 
  `/Users/JoshuaRigsby 1/Desktop/Upenn/MUSA_5080_Public Policy_Analytics/Portfolio_Files/portfolio-setup-jrigsbyr5/labs/assignment_2/data/Landmark_Points.geojson' 
  using driver `GeoJSON'
Simple feature collection with 1147 features and 20 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: -75.28169 ymin: 39.85929 xmax: -74.96915 ymax: 40.11991
Geodetic CRS:  WGS 84
st_crs(neighborhoods)
Coordinate Reference System:
  User input: WGS 84 
  wkt:
GEOGCRS["WGS 84",
    DATUM["World Geodetic System 1984",
        ELLIPSOID["WGS 84",6378137,298.257223563,
            LENGTHUNIT["metre",1]]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["geodetic latitude (Lat)",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["geodetic longitude (Lon)",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    ID["EPSG",4326]]
st_crs(public_art)
Coordinate Reference System:
  User input: WGS 84 
  wkt:
GEOGCRS["WGS 84",
    DATUM["World Geodetic System 1984",
        ELLIPSOID["WGS 84",6378137,298.257223563,
            LENGTHUNIT["metre",1]]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["geodetic latitude (Lat)",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["geodetic longitude (Lon)",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    ID["EPSG",4326]]
st_crs(historic_sites)
Coordinate Reference System:
  User input: WGS 84 
  wkt:
GEOGCRS["WGS 84",
    DATUM["World Geodetic System 1984",
        ELLIPSOID["WGS 84",6378137,298.257223563,
            LENGTHUNIT["metre",1]]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["geodetic latitude (Lat)",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["geodetic longitude (Lon)",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    ID["EPSG",4326]]
st_crs(landmarks)
Coordinate Reference System:
  User input: WGS 84 
  wkt:
GEOGCRS["WGS 84",
    DATUM["World Geodetic System 1984",
        ELLIPSOID["WGS 84",6378137,298.257223563,
            LENGTHUNIT["metre",1]]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["geodetic latitude (Lat)",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["geodetic longitude (Lon)",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    ID["EPSG",4326]]

Questions to answer: - What dataset did you choose and why? I chose neighborhood boundaries, public art, landmarks, and historic sites to identify neighborhoods that have better access to cultural amenities and neighborhoods that lack cultural amenities. - What is the data source and date? Source for all data sets: OpenDataPhilly - https://opendataphilly.org/datasets/, public art: July 2, 2020, neighborhood boundaries: July 2, 2020, landmarks: 11/01/2017, historic sites: N/A
- How many features does it contain? neighborhoods = 159, public art = 239, landmarks = 1147, historic sites = 23375 - What CRS is it in? Did you need to transform it? They are all in EPSG 4326, no need to transform.
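The feature counts reported above can be generated in one call instead of reading each `st_read()` banner:

```r
# Count features in each loaded layer at once
sapply(
  list(neighborhoods = neighborhoods, public_art = public_art,
       historic_sites = historic_sites, landmarks = landmarks),
  nrow
)
```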


  2. Pose a research question

What neighborhoods in Philadelphia lack access and proximity to cultural amenities?

Examples: - “Do vulnerable tracts have adequate public transit access to hospitals?” - “Are EMS stations appropriately located near vulnerable populations?” - “Do areas with low vehicle access have worse hospital access?”


  3. Conduct spatial analysis

Use at least TWO spatial operations to answer your research question.

Required operations (choose 2+): - Buffers - Spatial joins - Spatial filtering with predicates - Distance calculations - Intersections or unions - Point-in-polygon aggregation
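Two of the operations listed above can be combined in a few lines. A sketch (the quarter-mile buffer distance is an illustrative assumption, and the layers are reprojected to EPSG:2272 so the buffer distance is in US survey feet):

```r
# Buffer landmarks by 0.25 mi (1,320 US ft) and flag neighborhoods that
# intersect any buffer: a buffer plus a spatial predicate
landmarks_2272     = st_transform(landmarks, 2272)
neighborhoods_2272 = st_transform(neighborhoods, 2272)

landmark_buffers = st_union(st_buffer(landmarks_2272, dist = 1320))
near_landmark = st_intersects(neighborhoods_2272, landmark_buffers,
                              sparse = FALSE)[, 1]
table(near_landmark)
```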

Your Task:

# Your spatial analysis


# The public art data contains a lot of invalid polygons. Even after
# troubleshooting, it still caused an error in the spatial join below, and I
# still wanted to include the layer in my analysis, so I turned off s2 geometry.

sf::sf_use_s2(FALSE)

# Spatial Join
# First, join each amenity layer to neighborhoods separately; this gives three new spatial data frames.
public_art_neighborhood = st_join(public_art, neighborhoods)
historic_sites_neighborhood = st_join(historic_sites, neighborhoods)
landmarks_neighborhood = st_join(landmarks, neighborhoods)

# Count per neighborhood for each amenity - point-in-polygon aggregation

# Public art
public_art_counts <- public_art_neighborhood %>%
  st_set_geometry(NULL) %>%
  group_by(NAME) %>%                       # NAME = neighborhood name from the join
  summarise(public_art_count = n())

# Historic sites
historic_counts <- historic_sites_neighborhood %>%
  st_set_geometry(NULL) %>%
  group_by(NAME) %>%                       # NAME = neighborhood name from the join
  summarise(historic_count = n())

# Landmarks
landmark_counts <- landmarks_neighborhood %>%
  st_set_geometry(NULL) %>%
  group_by(NAME.y) %>%                     # NAME.y = neighborhood name (landmarks also has its own NAME column)
  summarise(landmark_count = n())

# Make the counts into one data set
neighborhood_counts = neighborhoods %>%
  st_set_geometry(NULL) %>%
  select(NEIGHBORHOOD = NAME) %>%
  left_join(public_art_counts, by = c("NEIGHBORHOOD" = "NAME")) %>%
  left_join(historic_counts, by = c("NEIGHBORHOOD" = "NAME")) %>%
  left_join(landmark_counts, by = c("NEIGHBORHOOD" = "NAME.y")) %>%
  mutate(
    public_art_count = replace_na(public_art_count, 0),
    historic_count = replace_na(historic_count, 0),
    landmark_count = replace_na(landmark_count, 0)
  )


# Kable with counts

kable(
  neighborhood_counts,
  col.names = c("Neighborhood", "Public Art", "Historic Sites", "Landmarks"),
  caption = "Cultural Assets per Neighborhood"
)
Cultural Assets per Neighborhood
Neighborhood Public Art Historic Sites Landmarks
BRIDESBURG 0 0 4
BUSTLETON 0 4 2
CEDARBROOK 0 1 1
CHESTNUT_HILL 3 96 37
EAST_FALLS 1 229 9
MOUNT_AIRY_EAST 0 51 2
GRAYS_FERRY 0 2 8
OLNEY 0 6 5
PENNYPACK_PARK 0 4 14
SOMERTON 0 1 1
MOUNT_AIRY_WEST 0 104 4
WEST_OAK_LANE 0 0 1
WISSAHICKON_PARK 0 11 39
RIVERFRONT 2 316 27
BYBERRY 0 1 5
WEST_TORRESDALE 0 0 0
MECHANICSVILLE 0 0 0
PARKWOOD_MANOR 0 3 3
FRANKLIN_MILLS 0 0 0
NORTHEAST_AIRPORT 0 0 0
MODENA 0 0 1
MORRELL_PARK 0 0 1
MILLBROOK 0 0 4
CRESTMONT_FARMS 0 4 1
ACADEMY_GARDENS 0 0 0
PENNYPACK 0 1 1
ASTON_WOODBRIDGE 0 0 2
PENNYPACK_WOODS 0 21 2
WINCHESTER_PARK 0 0 2
TORRESDALE 0 4 9
LEXINGTON_PARK 0 0 1
RHAWNHURST 0 1 3
FOX_CHASE 0 7 9
BURHOLME 0 2 2
OXFORD_CIRCLE 0 1 3
SUMMERDALE 0 0 1
CRESCENTVILLE 0 0 1
LAWNDALE 1 5 2
NORTHWOOD 0 23 3
HOLMESBURG 0 7 7
FRANKFORD 0 11 4
MAYFAIR 0 0 3
TACONY 0 1 5
WISSINOMING 0 10 2
ANDORRA 0 0 6
UPPER_ROXBOROUGH 0 3 17
DEARNLEY_PARK 0 6 6
ROXBOROUGH_PARK 0 3 0
GERMANY_HILL 0 2 2
WISSAHICKON_HILLS 0 0 0
ROXBOROUGH 0 7 3
MANAYUNK 0 449 7
WISSAHICKON 0 27 0
GERMANTOWN_EAST 0 38 2
GERMANTOWN_MORTON 2 12 0
GERMANTOWN_WEST_CENT 0 46 3
GERMANTOWN_PENN_KNOX 3 142 3
GERMANTOWN_WESTSIDE 0 4 0
GERMANTOWN_SOUTHWEST 0 10 0
OGONTZ 0 1 10
WISTER 1 65 2
NICETOWN 1 41 0
TIOGA 0 1 3
ALLEGHENY_WEST 0 4 6
GLENWOOD 0 0 1
FERN_ROCK 0 0 3
EAST_OAK_LANE 0 0 1
MELROSE_PARK_GARDENS 0 0 1
FRANKLINVILLE 0 2 1
FELTONVILLE 0 0 1
RICHMOND 0 0 8
PORT_RICHMOND 0 0 1
HUNTING_PARK 0 2 5
JUNIATA_PARK 0 21 3
HARROWGATE 0 0 4
FAIRHILL 0 0 0
UPPER_KENSINGTON 0 2 2
MCGUIRE 0 0 1
STANTON 1 99 2
BREWERYTOWN 1 2 2
SHARSWOOD 0 1 4
NORTH_CENTRAL 4 189 4
YORKTOWN 4 0 6
LUDLOW 1 1 1
HARTRANFT 10 14 29
WEST_KENSINGTON 1 7 5
FISHTOWN 0 47 8
OLD_KENSINGTON 0 10 4
NORTHERN_LIBERTIES 7 38 7
LOGAN 0 33 13
SOCIETY_HILL 33 3472 11
OLD_CITY 22 3117 27
CHINATOWN 0 211 1
CENTER_CITY 4 177 12
WASHINGTON_SQUARE 21 1659 15
FAIRMOUNT 0 147 2
FRANCISVILLE 2 30 2
SPRING_GARDEN 2 2083 3
LOGAN_SQUARE 15 505 66
RITTENHOUSE 2 5995 21
FITLER_SQUARE 0 289 5
GRADUATE_HOSPITAL 1 1123 5
POINT_BREEZE 0 3 3
HAWTHORNE 1 10 1
BELLA_VISTA 0 23 5
QUEEN_VILLAGE 0 747 4
DICKINSON_NARROWS 0 6 0
PASSYUNK_SQUARE 1 0 4
GREENWICH 0 0 1
LOWER_MOYAMENSING 0 5 1
WHITMAN 0 0 1
INDUSTRIAL 6 0 8
AIRPORT 3 1 22
CLEARVIEW 0 1 0
PENROSE 3 0 2
PASCHALL 0 4 3
BARTRAM_VILLAGE 0 1 4
KINGSESSING 0 10 3
COBBS_CREEK 0 0 2
WALNUT_HILL 0 2 1
CEDAR_PARK 0 55 1
GARDEN_COURT 0 0 1
WOODLAND_TERRACE 0 48 3
SPRUCE_HILL 0 61 2
SOUTHWEST_SCHUYLKILL 0 1 2
UNIVERSITY_CITY 33 83 132
POWELTON 0 57 3
WEST_POWELTON 1 3 7
EAST_PARKSIDE 0 110 1
BELMONT 0 2 1
HAVERFORD_NORTH 0 1 0
WEST_PARKSIDE 0 0 1
MILL_CREEK 0 2 2
DUNLAP 0 0 1
MANTUA 4 24 15
HADDINGTON 4 1 1
CARROLL_PARK 0 0 3
OVERBROOK 0 469 18
WYNNEFIELD_HEIGHTS 0 0 3
CALLOWHILL 6 8 3
WEST_POPLAR 4 6 3
EAST_POPLAR 6 8 2
STRAWBERRY_MANSION 0 6 4
EAST_PARK 0 32 100
WYNNEFIELD 0 41 1
WEST_PARK 3 13 85
NORMANDY_VILLAGE 0 0 1
STADIUM_DISTRICT 0 0 49
NAVY_YARD 0 0 6
EAST_KENSINGTON 0 9 7
ELMWOOD 0 1 1
GIRARD_ESTATES 0 505 3
EASTWICK 19 0 7
PACKER_PARK 0 2 6
PENNSPORT 1 13 2
NEWBOLD 0 1 1
WEST_PASSYUNK 0 0 1
EAST_PASSYUNK 0 0 0
BLUE_BELL_HILL 0 8 6
# Join counts back to neighborhood data for visualization
neighborhoods_with_counts <- neighborhoods %>%
  left_join(neighborhood_counts, by = c("NAME" = "NEIGHBORHOOD"))

# Map
# Create composite count
neighborhoods_with_counts$amenity_total <- 
  neighborhoods_with_counts$public_art_count +
  neighborhoods_with_counts$historic_count +
  neighborhoods_with_counts$landmark_count

# Create map
ggplot(neighborhoods_with_counts) +
  geom_sf(aes(fill = amenity_total), color = "black", size = 0.2) +
  scale_fill_gradient(
    low = "#c6dbef",  # darker light blue
    high = "#08519c", 
    na.value = "white"
  ) +
  labs(
    fill = "Total Amenities",
    title = "Total Cultural Amenities per Neighborhood",
    subtitle = "Sum of Public Art, Historic Sites, and Landmarks"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(face = "bold", size = 16),
    plot.subtitle = element_text(size = 12)
  )

# Make interactive with leaflet

library(leaflet)

leaflet(neighborhoods_with_counts) %>%
  addTiles() %>%  # base map
  addPolygons(
    fillColor = ~colorNumeric("Blues", amenity_total)(amenity_total),
    weight = 1,
    color = "black",
    fillOpacity = 0.7,
    popup = ~paste0("<strong>", NAME, "</strong><br>Total Amenities: ", amenity_total)
  ) %>%
  addLegend(
    pal = colorNumeric("Blues", neighborhoods_with_counts$amenity_total),
    values = ~amenity_total,
    title = "Total Amenities",
    position = "bottomright"
  )

Analysis requirements: - Clear code comments explaining each step - Appropriate CRS transformations - Summary statistics or counts - At least one map showing your findings - Brief interpretation of results (3-5 sentences)

Your interpretation: Looking at the interactive map, it is evident that the Rittenhouse neighborhood has the greatest access to cultural amenities. My initial question was which neighborhoods lack exposure and access to cultural amenities, but the analysis shows that nearly all Philadelphia neighborhoods have relatively few cultural features compared to neighborhoods in the city center. In absolute terms there may be many cultural features per neighborhood compared to other large cities in Pennsylvania, but the only comparison available here is within Philadelphia. After Rittenhouse, the neighborhoods with the greatest access to cultural amenities are Old City, Society Hill, Spring Garden, Washington Square, Logan Square, and Graduate Hospital. It is not surprising that this concentration exists in and near the city center.

Finally - A few comments about your incorporation of feedback!

Take a few moments to clean up your markdown document and then write a line or two or three about how you may have incorporated feedback that you received after your first assignment. I made sure to remove any brackets and instruction text for the final submission, and I was more careful about duplicate columns when filtering and joining.


Submission Requirements

What to submit:

  1. Rendered HTML document posted to your course portfolio with all code, outputs, maps, and text
    • Use embed-resources: true in YAML so it’s a single file
    • All code should run without errors
    • All maps and charts should display correctly

File naming: LastName_FirstName_Assignment2.html and LastName_FirstName_Assignment2.qmd